# Efficient Video Processing
Vamba Qwen2 VL 7B
MIT
Vamba is a hybrid Mamba-Transformer architecture that achieves efficient long video understanding through cross-attention layers and Mamba-2 modules.
Video-to-Text
Transformers

TIGER-Lab
806
16
Videochat Flash Qwen2 7B Res224
Apache-2.0
A multimodal model built on UMT-L and Qwen2-7B that supports long-video understanding using only 16 tokens per frame and an extended context window of 128k tokens.
Video-to-Text
Transformers English

OpenGVLab
80
6
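
To put the "16 tokens per frame" and "128k" figures in perspective, the rough sketch below estimates how many frames fit in the context window. It assumes that "128k" means 128 × 1024 tokens and that the whole window is available for video unless some is reserved for text. The helper `max_frames` and its parameters are hypothetical, written only for this illustration, and are not part of either model's API.

```python
def max_frames(context_tokens: int, tokens_per_frame: int,
               reserved_text_tokens: int = 0) -> int:
    """Upper bound on the number of frames that fit in a context window.

    Hypothetical helper for illustration only. It ignores any special
    tokens or separators a real model may insert between frames.
    """
    if tokens_per_frame <= 0:
        raise ValueError("tokens_per_frame must be positive")
    available = max(context_tokens - reserved_text_tokens, 0)
    return available // tokens_per_frame


# Assumption: "128k" is read as 128 * 1024 tokens.
CONTEXT_TOKENS = 128 * 1024
TOKENS_PER_FRAME = 16  # figure taken from the model description

frames = max_frames(CONTEXT_TOKENS, TOKENS_PER_FRAME)
print(f"frames that fit: {frames}")

# At a sampling rate of 1 frame per second (an assumed rate), this is:
print(f"approx. hours of video at 1 fps: {frames / 3600:.2f}")
```

Under these assumptions the budget is on the order of a few thousand frames. Reserving part of the window for the prompt and the answer lowers that bound proportionally.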